Speech Recognition and Elimination the Noise Based on MFCC
Abstract
Dynamic time warping (DTW) is a method that calculates an optimal match between two given sequences under certain restrictions. The sequences are "warped" non-linearly in the time dimension to determine a measure of their similarity that is independent of certain non-linear variations in time. This sequence alignment method is often used in time series classification. Although DTW measures a distance-like quantity between two given sequences, it does not guarantee that the triangle inequality holds. Voice recognition is the ability of a machine to recognize spoken words and convert them to any desired form. As the world moves towards automation, the applications of real-time voice recognition are increasing day by day. A voice recognition system is a good choice for giving voice commands to any device that requires user input to operate; lifts, televisions, gaming stations, smartphones and medical instruments are a few of many such examples. A real-time voice recognition system first requires some training and is then ready to recognize real-time voice input. For a new incoming voice command, the system tries to match its features against the existing data set. The command is then classified as the ‘best-matched’ command from the existing data set. Technological advances in the field of pattern recognition have made voice recognition more reliable.
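The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each utterance has already been reduced to a sequence of feature vectors (for example MFCC frames), uses toy one-dimensional values in place of real features, and takes the Euclidean distance and the basic three-way step pattern as assumptions. The names `dtw_distance`, `classify` and `templates` are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i frames of a
    # with the first j frames of b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # assumed local distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # frame of a repeated against b
                cost[i][j - 1],      # frame of b repeated against a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label of the stored template with the lowest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Toy "training" set: one template per command (stand-ins for MFCC frames).
templates = {
    "up": [[0.0], [1.0], [2.0], [3.0]],
    "down": [[3.0], [2.0], [1.0], [0.0]],
}

# A slower rendition of "up": the same shape, stretched in time.
query = [[0.0], [0.0], [1.0], [2.0], [2.0], [3.0]]
best = classify(query, templates)  # "up"

# The triangle inequality can fail for this measure:
x, y, z = [[0.0]], [[1.0]], [[1.0], [1.0], [1.0]]
d_xz = dtw_distance(x, z)  # 3.0
d_xy = dtw_distance(x, y)  # 1.0
d_yz = dtw_distance(y, z)  # 0.0
```

In this toy case the stretched query aligns with the "up" template at zero cost, so warping absorbs the difference in speaking rate. The last three lines give one concrete instance where the cost from `x` to `z` exceeds the sum of the costs through `y`, which is why DTW is described as distance-like rather than as a metric.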
Similar resources
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
Improving the performance of MFCC for Persian robust speech recognition
The Mel frequency cepstral coefficients are the most widely used feature in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
Auditory model based speech recognition in noisy environment
The main purpose of this paper is to present how to raise the speech recognition performance in noisy environment. So far the most popularly used speech feature in speech recognition is probably the so-called MFCC. The recognition rate of speech recognition algorithm using MFCC and CDHMM is known to be very high in clean speech environment, but it deteriorates greatly in noisy environment, espe...
Entropy based combination of tandem representations for noise robust ASR
In this paper, we present an entropy based method to combine tandem representations of the recently proposed Phase AutoCorrelation (PAC) based features and Mel-Frequency Cepstral Coefficients (MFCC) features. PAC based features, derived from a nonlinear transformation of autocorrelation coefficients and shown to be noise robust, improve their robustness to additive noise in their tandem represen...
Robust speech/non-speech detection using LDA applied to MFCC for continuous speech recognition
Continuous speech recognition applications need precise detection because the number of words to recognize is unknown and vocabulary words can be short. The speech/non-speech detection must be robust to the boundary precision. In this work, a new approach to evaluate detection algorithm for continuous speech recognition is presented. The speech/non-speech detection using energy parameter combin...
Robust speech/non-speech detection using LDA applied to MFCC
In speech recognition, a speech/non-speech detection must be robust to noise. In this work, a new method for speech/nonspeech detection using a Linear Discriminant Analysis (LDA) applied to Mel Frequency Cepstrum Coefficients (MFCC) is presented. The energy is the most discriminant parameter between noise and speech. But with this single parameter, the speech/non-speech detection system detects...